Newton slopes for Artin-Schreier-Witt towers
We fix a monic polynomial $\bar{f}(x) \in \mathbb{F}_q[x]$ over a finite field of characteristic $p$ and
consider the Artin-Schreier-Witt tower defined by $\bar{f}(x)$; this is a tower of
curves $\cdots \to C_m \to C_{m-1} \to \cdots \to C_0 = \mathbb{A}^1$, with total
Galois group $\mathbb{Z}_p$. We study the Newton slopes of zeta functions of
this tower of curves. This reduces to the study of the Newton slopes of
L-functions associated to characters of the Galois group of this tower. We
prove that, when the conductor of the character is large enough, the Newton
slopes of the L-function form arithmetic progressions which are independent of
the conductor of the character. As a corollary, we obtain a result on the
behavior of the slopes of the eigencurve associated to the Artin-Schreier-Witt
tower, analogous to the result of Buzzard and Kilford.
Comment: 15 pages; relative to the refereed version (to appear in Math. Ann.), we
fixed two minor errors, one in the proof of Theorem 3.8 and the other in
Theorem 4.
Masked Language Model Scoring
Pretrained masked language models (MLMs) require finetuning for most NLP
tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood
scores (PLLs), which are computed by masking tokens one by one. We show that
PLLs outperform scores from autoregressive language models like GPT-2 in a
variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an
end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on
state-of-the-art baselines for low-resource translation pairs, with further
gains from domain adaptation. We attribute this success to PLL's unsupervised
expression of linguistic acceptability without a left-to-right bias, greatly
improving on scores from GPT-2 (+10 points on island effects, NPI licensing in
BLiMP). One can finetune MLMs to give scores without masking, enabling
computation in a single inference pass. In all, PLLs and their associated
pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of
pretrained MLMs; e.g., we use a single cross-lingual model to rescore
translations in multiple languages. We release our library for language model
scoring at https://github.com/awslabs/mlm-scoring.
Comment: ACL 2020 camera-ready (presented July 2020)
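The pseudo-log-likelihood described above can be sketched in a few lines: mask each token one at a time, score the true token under the model, and sum the log-probabilities. This is a minimal illustration, not the mlm-scoring library's API; `toy_fill_prob` is a hypothetical unigram stand-in for querying a real MLM such as RoBERTa at the masked position.

```python
import math

# Hypothetical stand-in for an MLM's masked-position prediction: a real PLL
# would replace tokens[position] with [MASK] and read the MLM's softmax
# probability of the true token. Here we use smoothed unigram frequencies.
def toy_fill_prob(tokens, position, counts, total):
    """P(tokens[position] | context) under a toy unigram model."""
    token = tokens[position]
    # Laplace smoothing so unseen tokens get nonzero probability.
    return (counts.get(token, 0) + 1) / (total + len(counts) + 1)

def pseudo_log_likelihood(tokens, counts, total):
    """PLL: sum of log P(token_i | rest), masking one token at a time."""
    return sum(
        math.log(toy_fill_prob(tokens, i, counts, total))
        for i in range(len(tokens))
    )

def pseudo_perplexity(tokens, counts, total):
    """PPPL: exp of the negative PLL normalized by token count."""
    return math.exp(-pseudo_log_likelihood(tokens, counts, total) / len(tokens))
```

Rescoring then ranks competing ASR or NMT hypotheses by PLL, preferring the one the (pseudo-)language model finds more acceptable.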
Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity
There is a recently discovered and intriguing phenomenon called Neural
Collapse: at the terminal phase of training a deep neural network for
classification, the within-class penultimate feature means and the associated
classifier vectors of all flat classes collapse to the vertices of a simplex
Equiangular Tight Frame (ETF). Recent work has tried to exploit this phenomenon
by fixing the related classifier weights to a pre-computed ETF to induce neural
collapse and maximize the separation of the learned features when training with
imbalanced data. In this work, we propose to fix the linear classifier of a
deep neural network to a Hierarchy-Aware Frame (HAFrame), instead of an ETF,
and use a cosine similarity-based auxiliary loss to learn hierarchy-aware
penultimate features that collapse to the HAFrame. We demonstrate that our
approach reduces the mistake severity of the model's predictions while
maintaining its top-1 accuracy on several datasets of varying scales with
hierarchies of heights ranging from 3 to 12. We will release our code on GitHub
in the near future.
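For context, the simplex Equiangular Tight Frame that prior work fixes the classifier to has a simple closed form: $C$ unit-norm vectors with pairwise cosine similarity $-1/(C-1)$. The sketch below constructs it in plain Python; it illustrates the ETF baseline only, not the paper's hierarchy-aware HAFrame targets.

```python
import math

def simplex_etf(num_classes):
    """Rows of a C x C simplex ETF: sqrt(C/(C-1)) * (I - (1/C) * 11^T).
    Each row has unit norm; distinct rows have cosine similarity -1/(C-1),
    the maximal pairwise separation achievable by C vectors."""
    c = num_classes
    scale = math.sqrt(c / (c - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / c) for j in range(c)]
        for i in range(c)
    ]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```

Fixing classifier weights to such a frame (or to a hierarchy-aware variant) turns training into pulling penultimate features toward predetermined targets rather than learning the targets jointly.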
Latest Cosmological Constraints on Cardassian expansion models including the updated Gamma-ray bursts
In this paper, we constrain the Cardassian expansion models from the latest
observations including the updated Gamma-ray bursts (GRBs), which are calibrated
cosmology-independently from the Union2 compilation of type Ia supernovae (SNe
Ia). By combining the GRB data with the Union2 SNe Ia set, the Cosmic Microwave
Background radiation observation from the seven-year Wilkinson Microwave
Anisotropy Probe result, and the baryonic acoustic oscillation observation from
the spectroscopic Sloan Digital Sky Survey Data Release galaxy sample, we find
significant constraints on the model parameters of the original Cardassian
model and of the modified polytropic Cardassian model, which are consistent
with the $\Lambda$CDM model within the 1-$\sigma$ confidence region.
From the reconstruction of the deceleration parameter $q(z)$ in Cardassian
models, we obtain the transition redshift for the original Cardassian model and
for the modified polytropic Cardassian model.
Comment: 11 pages, 5 figures, 1 table; accepted for publication in Res.
Astron. Astrophys.
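For context, the standard forms of the two models named above (as commonly written in the Cardassian literature, not stated in the abstract itself): the original Cardassian model adds a power-law term to the Friedmann equation,

```latex
H^2 = \frac{8\pi G}{3}\rho + B\rho^{n},
\qquad
E^2(z) \equiv \frac{H^2(z)}{H_0^2}
      = \Omega_m (1+z)^{3} + (1-\Omega_m)(1+z)^{3n},
```

while the modified polytropic Cardassian model introduces a second parameter $q$,

```latex
E^2(z) = \Omega_m (1+z)^{3}
         \left[ 1 + \frac{\Omega_m^{-q} - 1}{(1+z)^{3q(1-n)}} \right]^{1/q},
```

which reduces to the original model at $q = 1$ and satisfies $E^2(0) = 1$, so constraining $(n, q)$ against SNe Ia, GRB, CMB, and BAO data is what the analysis above carries out.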